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Abstract 


The  present  report  outlines  recent  results  relative  to  the  develop¬ 
ment  of  new  linear  and  nonlinear  approaches  for  solving  groundwater 
flow  problems.  Efficient  and  robust  simulation  of  complex  processes 
in  environmental  applications  is  crucial  for  understanding,  forecast¬ 
ing  and  performing  opportune  and  reliable  decisions.  This  research 
effort  focuses  in  developing  efficient  solvers  for  two  main  models  in 
groundwater  flow:  Richards’  equation  and  air-water  coupled  equation 
systems.  To  achieve  this  goal,  two  main  streams  of  research  are  pre¬ 
sented:  (a)  High-order  Newton-Krylov  methods,  and  (b)  Decoupling 
and  constraint  operators  for  preconditioning  coupled  systems  of  non¬ 
linear  equations.  Numerical  results  reveal  that  these  approaches  have 
high  potentials  for  dealing  with  stringent  physical  situations. 


1  Introduction 

High-order  Krylov-Newton  methods  refer  to  a  novel  idea  originally  presented 
in  [12]  for  solving  oil  reservoir  applications  and  that  has  the  potential  to 
accelerate  the  nonlinear  iterations  by  exploiting  the  Krylov  information  gen¬ 
erated  by  the  GMRES  iterative  method.  This  idea  is  for  first  time  applied 
to  solve  Richards’  equation  for  modelling  unsaturated  flow. 

Decoupling  and  constraint  operators  are  the  basis  for  developing  ef¬ 
ficient  two-stage  preconditioners  and  super-coarsening  multigrid  methods 
[5,  12,  16,  17].  The  framework  can  be  used  as  a  starting  point  for  gen¬ 
erating  more  sophisticated  preconditioning  approaches  when  the  discretized 
model  is  coupled  and  involves  the  solution  of  variables  with  different  physical 
behavior  such  as  pressure  head  and  saturation  in  a  water-oil  system. 

This  report  is  organized  as  follows.  We  follow  with  a  brief  presentation 
of  the  Newton-Krylov  framework  that  encompasses  the  solution  of  Richards’ 
equation  and  the  two-phase  problems  in  porous  media.  In  sections  3  and 
4,  we  describe  how  this  framework  can  be  extended  to  increase  nonlinear 
convergence  by  means  of  secant  updates  restricted  to  the  Krylov  subspace. 
Sections  5,  6,  7  and  8  focus  the  discussion  on  decoupling  and  constraint  oper¬ 
ators  and  how  they  can  be  used  as  a  basis  for  two-stage  preconditioning  and 
semi-  supercoarsening  multigrid.  Section  9  show  some  preliminary  numerical 
results.  We  conclude  with  some  extra  discussions  and  possible  extensions  to 
this  work.  Extended  descriptions  of  these  results  will  be  available  in  two 
separate  ICES  technical  reports  [13,  14]. 
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2  Newton-Krylov  framework 

The  solution  of  the  nonlinear  system  of  equations 

*»=0,  (1) 

where  F  :  ff  C  — >  Mn,  is  cornerstone  in  many  scientific  and  engineer¬ 

ing  applications.  A  variety  of  discretization  techniques  applied  to  systems 
of  coupled  nonlinear  partial  differential  equations  gives  rise  to  an  algebraic 
system  such  as  (1),  where  u  is  the  unknown  vector  representing  the  nodal 
values  of  the  state  variable. 

In  our  particular  case,  this  nonlinear  system  of  equations  arises  from  dis¬ 
cretizing  either  Richards’  equation  or  the  coupled  partial  differential  equa¬ 
tions  describing  the  air-water  system.  The  nonlinearities  in  both  equations 
are  present  in  the  temporal  term  and  the  diffusive  term  and  when  fully  im¬ 
plicit  schemes  are  employed  the  transient  and  steady-state  cases  lead  to  a 
nonlinear  algebraic  system  such  as  (1).  Newton’s  method  implies  the  solution 
of  the  Jacobian  system 

J  (u)  5u  =  —F  (u)  ,  (2) 

where  F  (u)  and  J  (u)  denote  the  evaluation  of  the  function  at  u  £  M"  and 
its  derivative  at  any  Newton  step,  respectively. 

Due  to  the  high  cost  or  absence  of  derivatives  for  the  explicit  construction 
of  the  Jacobian  J,  the  action  of  this  operator  onto  vector  (in  the  form  of 
matrix-vector  products)  can  be  performed  by  finite  differences.  This  is,  in 
fact,  one  of  the  most  appealing  features  of  Newton-Krylov  methods.  These 
methods  are  also  amenable  to  be  globalized  by  line-search  or  trust  region 
strategies  and  for  dynamic  control  of  linear  tolerances  (i.e. ,  forcing  terms). 
The  reader  is  referred  to  [15]  to  obtain  an  state  of  the  art  discussion  on 
the  subject.  Specific  theory  supporting  the  description  of  the  line-search 
backtracking  method  and  the  selection  of  the  forcing  terms  to  control  the 
convergence  of  the  linear  solver  at  each  nonlinear  step  can  be  found  in  [9, 
11,  21].  A  globalized  Newton-Krylov  method  with  a  line-search  backtracking 
procedure  can  be  stated  as  follows: 

Algorithm  2.1  (Newton-Krylov  algorithm  with  forcing  terms  and  line-search 
backtracking) 

1.  Let  u ®  be  an  initial  guess 

2.  For  k  =  0, 1, 2, . . .  until  convergence  do 
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2.1  Choose  7 /fc)  G  [0, 1) ,  t  €  (0, 1) 

2.2  Using  some  Krylov  subspace  iterative  method,  compute  a  vector 
s ®  satisfying 

=  _^(fc)  +  rWj  (3) 

1 1  r1^)  II  /  ;  \ 

with  tt— zU  ,.  I,,  <  rfk'. 

2.3  While  +  s(fc))||2  >  [l  -t  (l  ||F(fc)||  do 

2.3.1  Choose  A  to  minimize  a  quadratic  polynomial  over  A,  A  C 
(0, 1)  that  interpolates 

F  (A)  =  ||K(n(fc)  +  A.s(fc))||2. 

2.3.2  Update  =  A sik\ 

2.4  Update  =  1  —  A  ^1  — 

3.  Update  solution  =  u ^  +  s^k\ 

Some  observations  are  in  order. 


•  The  parameter  t  is  usually  small,  t  =  10  4  (i.e.,  a  small  margin  be¬ 
tween  the  predicted  and  actual  reduction  of  F ^  is  assumed  small). 

•  An  interval  value  A,  A  =  [.1,  .5]  is  typically  chosen. 

•  A  quadratic  interpolant  p,  is  constructed  in  such  a  way  that  p  (0)  = 

F  (0)  =  2,  p{l)  =  F  (1)  =  F^  +  sW  2 and  p'  (0)  =  F'  (0)  = 

2  (FW,JWSW). 

The  forcing  term  r is  selected  according  to  the  following  criterion: 
r/(fc)  =  min 1 77max , max 1 77^ ,  ||  , 


where 

p(k)  _  4  )s(fc-!) 

«(fc)  =  _ _ _ 

||F(fc-1)  || 

The  choice  of  rj ^  reflects  the  agreement  between  and  its  linear  model 
at  the  previous  iteration.  Thus,  the  linear  solver  tolerance  is  larger  when 
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the  Newton  step  is  less  likely  to  be  productive  and  smaller  when  the  step  is 
more  likely  to  lead  to  a  good  approximation.  This  operation  is  performed 
once  the  backtracking  procedure  has  returned  the  adequate  nonlinear  step 
size  and  update  of  the  solution. 


3  The  family  of  rank— one  solvers 

Rank-one  updates  for  solving  nonlinear  equation  are  sometimes  referred  as 
secant  or  quasi-Newton  methods  since  no  “true”  Newton  equation  is  ever 
formed  throughout  all  iterations.  These  updates  obey  a  secant  condition 
that  has  to  be  satisfied  by  the  new  Jacobian  approximation.  The  best  of 
these  methods  still  seems  to  be  the  first  one  originally  introduced  by  Broyden 
(see  e.g.,  [6]). 


3.1  Broyden’s  method 

Given  u  ~  u*  and  A  ~  J  ( u ) ,  we  can  find  an  approximate  new  Newton  step, 
u+,  by 

u+  =  u  —  A~lF  (u) .  (4) 

Broyden’s  method  computes  a  new  A+  by  means  of  the  following  rank- 
one  update 

A+  =  A+lIlNtNANNil,  (5) 

gls 

which,  whenever  the  approximated  Jacobian  equation  is  solved  exactly,  i.e. 
As  =  —F  («),  it  reduces  to 


A+ 


A  + 


F+  (u)gf 
gls 


for  gts  ^  0.  The  vector  g  can  be  chosen  in  several  ways.  For  instance,  when 
g  =  s,  we  obtain  the  “good  Broyden’s  update”  and  when  g  =  At  [i?+  ( u )  —  F  (it)], 
we  have  the  “bad  Broyden’s  update”.  The  algorithm  can  be  depicted  as  fol¬ 
lows: 


Algorithm  3.1  (Nonlinear  Broyden) 

1.  Give  an  initial  guess  u ^  and  Jacobian  approximation  Aq. 

2.  For  k  =  0,  lj —  until  convergence  do 

2.1  Solve  A^s^  =  -F(kl 
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2.2  Update  solution  u(k+1'1  =  u ^  +  s^k\ 

2.3  =  F^k+1^  —  F^k\ 


2.4  A^k+Y 


A (fe)  + 


(q(fc)_A(fc)s(fc))(s(fc))4 

(s(fc))G(fc) 


3.2  The  Nonlinear  Eirola-Nevanlinna  (NEN)  method 

The  nonlinear  Eirola-Nevanlinna  (NEN)  was  proposed  by  [25]  as  the  nonlin¬ 
ear  counterpart  of  the  linear  EN  algorithm  [8].  This  method  is  a  generaliza¬ 
tion  of  Broyden’s  method  for  solving  linear/nonlinear  equations.  Interest¬ 
ingly  enough,  the  algorithm  has  connections  with  the  GMRES  algorithm  [23]. 
The  NEN  algorithm  can  be  regarded  as  the  composition  of  two  Broyden’s 
iterations  as  the  following  presentation  suggests: 


Algorithm  3.2  (Nonlinear  EN) 

1.  Give  an  initial  guess  u ^  and  Jacobian  approximation  Mq. 

2.  For  k  =  0, 1, . . .  until  convergence  do 


2.1  Solve  M(k)s(k)  =  -UW. 

2.2  qW  =  F^k+Y  —  F^k\ 

2.3  >■ 

2.4  Solve  A'f(k+1)s(k)  =  -F(k\ 

2.5  Update  solution  =  u Y')  +  . 


Notice  that  the  direction  computed  by  the  NEN  algorithm  is  a  linear 
combination  of  the  direction  delivered  by  Broyden’s  method  and  an  extra 
direction  coming  from  step  2.4.  It  can  be  shown  that  the  NEN  algorithm 
converges  n-step  g-quadratically  for  n-dimensional  problems.  Therefore,  the 
NEN  method  converges  twice  faster  than  Broyden’s  method. 


4  High-order  Newton-Krylov  methods 

Consider  again  A  as  an  approximation  to  the  current  Jacobian  matrix  J . 
We  are  interested  in  looking  at  a  minimum  change  to  A  consistent  with 
A+s  =  F+  —  F  restricted  to  the  underlying  Krylov  subspace.  A  basis  for  this 
subspace  arises  as  result  of  using  an  iterative  linear  solver  such  as  GMRES 
for  solving  the  approximated  Jacobian  system  with  A. 
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4.1  Secant  updates  onto  the  Krylov  subspace 

The  Arnoldi  factorization  provides  valuable  information  that  should  not  be 
discarded  at  all  every  time  a  GMRES  solution  starts  over.  We  now  show 
how  to  reflect  secant  updates  on  the  Jacobian  matrix  without  altering  the 
current  Krylov  basis.  For  the  sake  of  simplicity,  let  us  omit  the  sources 
of  inexactness  induced  by  the  use  of  GMRES  whose  relative  residuals  are 
supposed  to  converge  at  a  predefined  tolerance  (i.e.,  to  a  prescribed  forcing 
term  value). 

Consider  the  solution  to  the  following  approximated  Jacobian  equation 
at  the  fcth  nonlinear  iteration 


M(fc)S(fc)  =  -F{k\  (6) 

with  l  steps  of  the  GMRES  algorithm.  This  linear  solution  can  be  regarded 
as  embedded  in  an  inexact  Broyden’s  method.  Let 

be  the  solution  obtained.  The  associated  Krylov  subspace  for  this  problem 
is  given  by  /cjk^  (^A^k\  •  Now,  we  wish  to  use  the  information  gathered 

during  the  solution  of  (6)  to  provide  an  approximation  to  the  system 

j^(k+l)  g(fc+l)  _  _p(k+ 1) 

with  corresponding  Krylov  basis  K,[k\A^k+1\  Clearly,  in  general  we 

can  not  guarantee  that  K^k+l\A(k+1\  =  JC[k\A(k\  ) .  However, 

rank-one  updates  onto  the  corresponding  Arnoldi  factorization  of  (6)  can  be 
done  without  destroying  the  Krylov  basis.  That  is, 

(A(C  +  vwzwt  V ^  =  V «  (tff  +  zw')  +  h^+Unv^+  ref,  (8) 

or  equivalently, 


(V^y  +  K(fcW  K(fc)  =  H\k)  +  zw\ 


(9) 


for  any  vectors  z,  w  £  M.1.  Expression  (9)  suggests  a  clearer  way  to  update 
Hjk>  rather  than  A^k\  Note  that  the  current  Jacobian  approximation  ap¬ 
pears  to  be  updated  by  a  rank-one  matrix  whose  range  lies  on  K,^ 

Before  proceeding,  it  would  be  convenient  to  express  the  secant  equation 

y4(fc+1)S(fc)  =  ^?(fc+1)  _  (XO) 


7 


in  terms  of  a  solution  lying  strictly  on  the  Krylov  subspace.  Otherwise,  this 
would  introduce  an  implicit  secant  equation  in  terms  of  A^k+1\  To  remove 
the  shift  from  the  origin,  we  reformulate  (6)  as 

A^s(k)  =  - F <*)  -  A^s{k)  =  rjfc),  (11) 

and  redefine  the  final  solution  as  sjk^  =  l/(fc)y(fe),  that  is,  as  if  the  initial 
guess  were  zero.  Obviously,  the  associated  Krylov  basis  is  the  same  depicted 
above.  Therefore,  the  secant  equation  (10)  becomes 


A(k+l)g(k)  =  F(k+ 1)  +  r(0;  (12) 

for  s ^  =  1 Ak)y(k) .  Multiplying  both  sides  by  (v(k^  it  readily  follows  that 
H[  ’  should  satisfy  the  following  secant  equation 


H, 


(k+l)y(k)  =  (y(k)y  F(k+1)  +  peu 


(13) 


where  (3  = 


Hence,  the  Krylov  subspace  projected  version  of  the 


secant  equation  (10)  can  be  written  as 


H, 


(k+ 1) 


=  iT(fc)  + 


(v^ 


'  _p(fc+i) 


/3e i  -  H\k)y ^ 


(14) 


4.2  The  nonlinear  KEN  algorithm 

We  are  now  in  a  position  to  describe  an  inexact  NEN  algorithm  that  exploits 
the  information  left  behind  by  the  GMRES  method.  Hence,  we  introduce  the 
nonlinear  Krylov-Eirola-Nevanlinna  (nonlinear  KEN)  algorithm  as  follows. 


Algorithm  4.1  (Nonlinear  KEN) 

1.  Give  an  initial  guess  u ^  and  Jacobian  approximation  A^ . 

2.  For  k  =  0, 1, . . .  until  convergence  do 


2.1 


s0)  y(fc)  R  = 

b  >  y  i  n l  j  v l  ■>  ll‘m+l,mi  r1  — 


=GMRES 


_F(k)  g(k) 


2.2  qW  =  (v^y  F(k+V  +  fie i. 


2.3  H, 


(  n(k)—H^v(kY\(y(k)Y 

(fc+i)  =  H{k)  \q  y  )yy  ) 

1  (y(fe))Vfc) 


2.4  Solve 


4.3  A  high-order  Newton— Krylov  algorithm 

A  even  faster  version  of  the  nonlinear  KEN  algorithm  can  be  attained  by 
performing  rank-one  updates  of  the  Hessenberg  matrix  as  far  as  possible 
before  making  the  next  GMRES  call.  The  extend  of  these  updates  is  de¬ 
termined  by  the  capability  the  residual  minimization  problem  in  producing 
descent  directions.  In  this  opportunity,  we  abandon  simultaneous  Jacobian 
and  Hessenberg  updates  and  check  instead  if  a  sufficient  amount  of  decrease 
of  ||E||  is  delivered  by  verifying  the  condition  state  at  step  2.3  of  Algorithm 
2.1. 

This  presentation  allows  us  to  illustrate  further  uses  of  the  Krylov- 
Broyden  update  (14) in  the  context  of  Newton’s  method.  We  stress  that 
further  updates  to  the  Hessenberg  matrix  may  result  in  relative  less  over¬ 
head  for  a  higher  order  version  of  the  inexact  Newton  method  than  a  possible 
faster  nonlinear  KEN  algorithm.  The  point  is  that  the  latter  one  requires 
simultaneous  updates  of  and  A ^  which  may  readily  increase  the  total 
number  of  updates.  Of  course,  this  may  be  a  desirable  situation  in  terms  of 
rapid  convergence  updates  but  it  may  turn  out  to  be  expensive  in  terms  of 
computer  memory  use.  Letting  /3  =  7q  ,  the  algorithm  can  be  outlined  as 
follows. 
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Algorithm  4.2  (High-Order  Newton- Krylov,  HONK) 
1.  Give  an  initial  guess  u ^  and  define  rmax- 


2.  For  k  =  0, 1, . . .  until  convergence  do 


2.1  [s(fe),  y(fe),  H[k\  V^,  h^ip  /?]  =GMRES(  J^k\  -F^k\s^ 

2.2  r  =  0. 

2.3  Repeat 

2.3.1  q(k+r^  =  (V^  F(k+r+V  +  (3e i. 

( „{k+r)_H(k+r)  tk+r)\,  (k+r)\t 

(fc+r+i)  =  H(k+i)  \q  ni  y  ){y  ) 


2.3.2  H, 

2.3.3  Solve 


nun 


(AW/V'J 


^y(k+r)yy(k+r 


o  i  T r(^+l+r) 

(3ei+Hl  y 


with  nf +r+1) 


//, 


(fc+r+l) 
l 


nl+l,lCl 


Denote  its  solution  by  y(fc+r+1). 

2.3.4  S(fc+Hi)  _  pdfc)y(fc+2+i). 

2.3.5  t  =  r+  1. 

2.4  Until  (?’  =  rmax)  OR  is  not  a  decreasing  step  for  ||iUfc+r) 

2.5  if  s(fc+U  is  a  decreasing  step  for  ||iHfc+r)||  then 

2.5.1  -|-  g(^+r  max ) 

2.6  else 

2.6.1  tt(fc+1)  =  «(*)  +  s(fc+r- !). 


3.  EndFor 

This  algorithm  can  be  devised  as  a  variant  of  the  composite  Newton’s 
method  that  seek  chord  directions  belonging  to  the  underlying  Krylov  sub¬ 
space.  The  faster  version  of  the  nonlinear  KEN  algorithm  can  be  easily  stated 
from  the  above  presentation  by  just  including  the  Krylov-Broyden  update  of 
A<'k'>  within  the  repeat  loop  statement  2.3.  This  version  should  be  appeal¬ 
ing  in  situations  where  Broyden’s  method  or  the  nonlinear  EN  are  effective 
compared  to  Newton’s  method.  Figure  1  summarizes  the  functionality  of  the 
HONK  algorithm. 
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Figure  1:  Schematic  representation  of  the  HONK  method. 

5  The  air-water  case:  A  coupled  algebraic  system 
of  equations 

By  applying  mixed  finite  element  (equivalent  to  cell-centered  finite  differ¬ 
ences  under  suitable  conditions,  see  e.g.  [7])  in  space  and  a  Euler  formulation 
in  time,  we  arise  to  a  2  X  2  block  linear  system  for  solving  the  changes  in 
water  pressures  and  water  saturations.  More  precisely,  equation  2  turns  out 
to  be 

<15) 

where  Rw,Rg  represents  the  value  of  the  water  equation  residual  and  gas 
equation  residual,  respectively.  Here,  the  nonlinear  function  F  (u)  :  Mm  — > 
M,n,  where  m  is  twice  the  number  of  gridblocks  of  the  discretized  domain, 
holds  the  value  of  the  residuals  at  the  current  iteration  level.  The  matrix 
J  (u)  is  known  as  the  Jacobian  matrix  and  represents  the  derivatives  of  the 
nonlinear  function  F  (u)  with  respect  each  value  of  u  at  every  gridblock 
ijk.  We  now  provide  general  description  of  the  linear  systems  (i.e. ,  Newton 
equation)  arising  in  step  2.2  of  Algorithm  2.1. 

According  to  (15)  each  block  =  p,  s  is  of  size  nb  x  nb,  where  nb,  is 

the  total  number  of  grid  blocks,  so  that  m  =  2  *nb.  Each  group  of  unknowns 
is  numbered  in  a  sequential  lexicographic  fashion:  the  pressure  unknowns 
are  numbered  from  one  through  the  total  number  of  grid  blocks  7ib  and  the 
saturations  are  numbered  from  nb+ 1  through  m.  These  observations  follow: 

•  The  matrix  J  is  indefinite  and  nonsymmetric. 
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•  Each  block  has  the  same  7-diagonal  structure. 

•  The  block  Jpp ,  containing  pressure  coefficients,  resembles  the  structure 
of  a  purely  elliptic  problem. 

•  The  block  Jps,  has  a  structure  similar  to  that  of  a  discretized  first-order 
hyperbolic  problem  in  the  gas  phase  saturations 

•  The  block  Jsp  has  the  coefficients  of  a  convection-free  parabolic  prob¬ 
lem  in  the  non-wetting  phase  pressure 

•  The  block  Jss  represents  a  parabolic  (convection-diffusion)  problem  for 
water  saturations. 

•  The  blocks  JppJps,  JSp  and  —  Jss  show  diagonal  dominance,  positive  di¬ 
agonal  entries  and  nonpositive  off-diagonal  entries  along  their  diagonal 
(i.e. ,  they  are  Z-matrices)  and  are  irreducible  (see  e.g.,  [1]  on  further 
implications  of  these  properties). 


6  Decoupling  Operators 


Consider  the  Jacobian  system  shown  in  (15)  and  let  us  define 


D  = 


Dpp  Ds 
Dsp  Ds 


diag  (Jpp)  diag(Jps) 
diag(Jsp)  diag(Jss) 


(16) 


that  is,  a  matrix  2x2  blocks,  each  of  them  containing  the  main  diagonal  of 
the  corresponding  block  of  J.  It  clearly  follows  that 


D~XJ  = 


JJssJpp  Dsn  Jsp  DssJps  DsnJss 


'sn°  sp 


' ss°ps 


JD  jD 
upp  u  sn 
J D  jD 
u sp  u  ss 


=  J 


DppJsp  DspJpp  DppJgS  Dspjps 


D 


(17) 


where  A  =  DppDss  —  DsnDsp.  We  can  observe  that  the  main  diagonal  entries 
of  the  main  diagonal  blocks  are  equal  to  one.  Conversely,  the  main  diagonal 
entries  of  the  off-diagonal  blocks  are  equal  to  zero.  In  fact,  we  can  expect 
that  the  degree  of  coupling  of  the  off-diagonal  blocks  of  J  has  been  reduced 
to  some  extent.  In  physical  terms,  the  decoupling  operator  tends  to  approx¬ 
imate  pressure  coefficients  as  if  saturations  derivatives  were  neglected  in  the 
transmissibility  coefficients.  Hence,  this  is  like  ’’time-lagging”  or  evaluating 
some  transmissibilities  explicitly. 
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The  above  decoupling  operator  admits  an  alternate  or  point-blockwise 
representation.  We  can  associate  the  smaller  matrix  blocks  with  individual 
unknowns  within  the  mesh.  This  means  to  permute  rows  and  columns  in  an 
interleaved  fashion  and  to  number  every  pressure  unknown  followed  by  the 
concentration  unknown  at  the  same  grid  block.  Let  P  be  the  matrix  that 
perform  such  permutation  and  define 


J  =  PJP *  = 


(  Jll  Jl  2 

J21  J22 

V  Jnbl  Jnb2 


Jin  b  \ 
Jnb 

Jnb,nb  J 


(18) 


where 


{Jpp)ij  ( Jps)ij 

( Jsp)ij  ( Jss)ij 


(19) 


is  the  2x2  matrix  representing  the  coupling  between  unknowns.  It  clearly 
follows  for  D  that 


D  =  PDP 1  D”1  =  PD~1Pt.  (20) 

Hence,  D~l  is  a  block  diagonal  matrix  whose  blocks  are  the  inverses  of  the 
Jacobian  blocks  associated  to  the  local  problem  at  each  grid  block.  That  is, 


/  Jn  0  •  •  •  0  \ 

0  J22  ' ' '  0 

V  0  0  ■  ■  ■  Jnb,nb  J 


(21) 


Clearly,  D~lJ  =  PD~1JPt.  In  the  case  of  the  combinative  approach, 
the  coefficients  introducing  the  coupling  with  pressures  are  zeroed  out  within 
the  grid  block  by  Gaussian  elimination  so  that  corresponding  coefficients  at 
neighboring  grid  blocks  are  expected  to  become  small.  To  be  more  precise, 
let 


where 


f  (Wp)  1  J  •••  0  \ 

~  0  (Wp)  2  0 

WP=  .  .  . 

V  0  0  •  •  •  (Wp)nb  J 

{yVp)i  =  Inuxnu  T  ^ ^  Cl J 


(22) 
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with  e\  =  (1,0)*.  Therefore,  the  operator  Wp  is  a  block  diagonal  matrix 
that  removes  the  coupling  in  each  2x2  diagonal  block  with  respect  to  the 
pressure  unknown.  In  fact,  it  readily  follows  that 


e\e!\ 


(Jpp)i,i  °  \ 

(Jsp)i,i  (Jss)i,i  ) 


(23) 


Similarly,  we  could  dehne  an  operator  Ws  with  the  canonical  vector  e2  = 
(0, 1)*.  The  operator  Wp  was  introduced  by  Wallis  in  his  IMPES  two-stage 
preconditioner  [24], 

Letting  JWp  =  WpJ,  the  consecutive  counterpart  Wp  of  the  alternate 
representation  of  the  operator  Wp  is  given  by 


JWp  =  wpJ  =  P?Vp  =  ( PWpP *)  (pjp*) 


f  jWp 

-  I  jpp 

=  jWp 

\  a  sp 


TwP 

jWP 

<Jss 


( PWpP *)  J 


(24) 


Note  that  the  lower  blocks  are  unmodified  as  well  as  the  main  diagonal 
of  the  resulting  pressure  block,  that  is,  Jspp  =  Jsp  and  Jssp  =  Jss- 


7  Two-stage  preconditioners 


In  order  to  reduce  the  already  decoupled  system  to  one  involving  a  particular 
set  of  coefficients,  say,  those  associated  to  pressure  unknowns,  the  operator 
Rp  €  Mnbxm  is  defined  by 


(Rp)ij 


1  if  i  =  k  and  j  =  1  +  2  (k  —  1) , 
0  otherwise 


for  k  =  1,2, .. .  ,nb.  In  this  particular  lexicographic  and  alternate  ordering 
of  unknowns,  we  could  also  dehne  j  =  2  +  2  (k  —  1)  for  Rls  in  order  to  obtain 
the  corresponding  saturation  coefficients. 

Therefore,  to  obtain  the  pressure  system,  the  following  operation  can  be 
applied 

R\JWpRv  =  j%*.  (25) 
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7.1  Combinative  two-stage  preconditioner 

Consider  the  two-stage  combinative  preconditioner  Mcomb  expressed  as  the 
solution  of  the  preconditioned  residual  equation  Mc0m.bv  =  r ■  Then  the 
action  of  the  preconditioner  Mcomb  is  described  by  the  following  steps, 

Algorithm  7.1  (Two-stage  Combinative,  2SComb) 


1.  Solve  the  reduced  pressure  system  ^RpJWp Rp^j  p  =  RpWpr  and  denote 
its  solution  by  p. 

2.  Obtain  expanded  solution  p  =  Rpp. 

3.  Compute  new  residual  f  =  r  —  Jp. 

4.  Precondition  and  correct  v  =  M~lr  +  p. 

The  action  of  the  whole  preconditioner  can  be  compactly  written  as 


v  =  MCombr  =  M" 


I  -  (j  -  m)  rp  ( 


RtpJWpRp 


-i 


RlvWv 


r. 


(26) 


7.2  Additive  two-stage  preconditioner 

With  the  use  of  the  operator  D  1  and  incorporating  the  solution  to  a  reduced 
saturation  system  in  addition  to  the  solution  to  the  reduced  pressure  system 
we  can  improve  the  quality  of  the  previous  preconditioner.  We  propose 
two  different  ways  to  accomplish  this:  additively  and  multiplicatively.  In 
the  following  we  present  both  procedures  for  computing  the  preconditioned 
residual  v  =  MW,  r  and  v  =  M~ultr.  We  denote  JD  =  D~1J.  The  additive 
two-stage  preconditioner  (2SAdd)  is  given  by 

Algorithm  7.2  (Two-stage  Additive,  2SAdd) 

1.  Solve  the  reduced  pressure  system  (^RpJD Rp^j  p  =  RtpD~1r  and  denote 
its  solution  by  p. 

2.  Solve  the  reduced  concentration  system  ^ RtsJDRs'j  c  =  RtsD~1r  and 
denote  its  solution  by  c. 

3.  Obtain  expanded  solutions  p  =  Rpp  and  c  =  Rsc. 

4.  Add  both  approximate  solutions  y  =  p  +  c. 

5.  Compute  new  residual  f  =  D~lr  —  JDy. 

6.  Precondition  and  correct  v  =  M^1r  +  y. 
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7.3  Multiplicative  two-stage  preconditioner 

The  multiplicative  two-stage  preconditioner  (2SMult)  proposes  the  sequen¬ 
tial  treatment  of  the  partially  preconditioned  residuals  instead.  In  algorith¬ 
mic  terms  it  is  given  by 

Algorithm  7.3  (Two-stage  Multiplicative,  2SMult) 

1.  Solve  the  reduced  pressure  system  (]r^pJD  R^j  p  =  RtpD^1r  and  denote 
its  solution  by  p. 

2.  Obtain  expanded  solutions  p  =  Rpp. 

3.  Construct  new  residuals  f  =  D  —  JDp. 

4.  Solve  the  reduced  concentration  system  ^ RtsJDRgSj  c  =  R^r  and  denote 
its  solution  by  c. 

5.  Obtain  expanded  solutions  c  =  Rsc. 

6.  Compute  new  residual  w  =  D~1r  —  JD(c  +  p). 

7.  Precondition  and  correct  v  =  M~1w  +  c  +  p. 


Assuming  that  both  reduced  pressures  and  saturations  are  solved  exactly 
and  introducing  the  notation 

t^R^Rj^Riy1  RjD~\  (27) 

for  l  =  p,s]  the  action  of  these  preconditioners  can  be  characterized  by 


and 


v  =  MJdr  =  M  1 


«  =  =  M-1 


I  - 


I  -  (j°  -  m)  (tp  +  ta ) 

( jd  -  m)  (tp +ts-  tsjty 


(28) 

(29) 


The  difference  between  the  additive  and  multiplicative  preconditioners 
resides  in  the  inclusion  of  the  cross  term  tsJtp  resulting  from  the  computation 
of  a  new  residual  in  step  6  of  Algorithm  7.2. 
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8  Constraint  operators 


Constraint  operators  are  primarily  used  to  solve  the  original  problem  into  a 
smaller  dimensional  space.  In  order  to  decrease  the  computational  require¬ 
ments  of  the  preconditioners  described  above  we  use  constraint  operators. 
These  operators  are  the  basis  for  many  well  known  projection  operators  such 
as:  deflation,  semicoarsening  and  supercoarsening  multigrid  operators. 


Solve  Ax=b 

Approximate 
Inverse  of  coupled 
matrix  A 

Decouple 

system 

Solve 

saturation 

svstem 

Apply 

Couple  approx. 

Solve 

pressure 

svstem 

Compute 

preconditioner 

solution 

residuals 

DD,  multilevel 
supercoarsening 
preconditioners... 


Figure  2:  Schematic  representation  of  two-stage  preconditioning  for  solving 
coupled  systems. 

In  the  case  of  semicoarsening,  the  solution  should  force  the  sum  of  the 
residuals  in  the  collapsed  direction  to  be  zero.  Those  solutions  are  then  ex¬ 
tended  (projected  back)  onto  the  original  dimension  and  new  residuals  are 
formed.  Frequently,  in  order  to  capture  heterogeneities  along  the  collapsed 
direction,  a  general  relaxation  is  performed  on  the  new  residuals.  The  col¬ 
lapsing  is  done,  in  most  cases,  along  the  vertical  direction  (i.e.,  depth.)  Let 
us  denote  the  depth  coordinate  as  k  containing  nz  grid  blocks  along  this 
direction  and  the  plane  coordinates  as  i  and  j  with  nx  and  ny  grid  blocks, 
respectively.  Suppose  a  lexicographic  numbering  of  nb  unknowns  with  each 
one  of  them  occupying  the  position  ( i,j ,  k ).  Hence,  the  collapsing  operation 
can  be  represented  by  a  rectangular  matrix  C  6  ^nbxnx*ny  suc,h  that 

,  .  _  J  1  if  l  =  i  +  (j  —  1)  *  nx  and  m  =  l  +  (k  —  1)  *  nx  *  ny, 

lm  (  0  otherwise 

The  semicoarsening  can  be  described  by  the  following  steps  for  solving,  say, 
the  three  dimensional  algebraic  system  Ty  =  r: 

Algorithm  8.1  (Semicoarsening) 
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1.  Collapse  residuals  by  means  of  the  operator  Cf  and  solve  (C^TC)  w  = 
C\ 

2.  Expand  solution  to  the  original  dimension:  w  =  Cw, 

3.  Compute  new  residuals  r  =  r  —  Tw,  and, 

4.  Perform  some  relaxation  steps  by  a  suitable  stationary  iterative  method 
along  the  collapsed  direction  (e.g.  line  Jacobi,  line  SOR.)  Obtain  z. 

5.  Set  y  =  w  +  z. 


We  now  proceed  to  describe  how  to  incorporate  this  method  in  the  frame¬ 
work  of  two-stage  preconditioners.  Consider  the  stage  of  solving  the  pressure 
part  in  any  of  the  preconditioners  defined  for  the  alternate  ordering  of  un¬ 
knowns.  The  application  of  line  correction,  for  instance,  in  the  combinative 
approach  implies  the  following  computation  in  steps  (1-3)  in  Algorithm  5.5.1, 

r  =  RpWpr  -  JC  (r^R^  _1  R^Wpr, 

and  the  subsequent  application  of  some  suitable  relaxation  steps  on  r.  Here, 
Rp  =  CRp. 

Note  that  none  of  the  operators  Rp,  Rp  or  C  have  to  be  explicitly  formed 
for  implementation  purposes.  Moreover,  approximation  of  the  reduced  ma¬ 
trices  in  pressure  or  concentration  can  be  stored  in  factored  form  prior  to 
the  execution  of  the  outer  linear  solver.  Application  of  Rp,  C  or  any  other 
analogous  reduction  operator  can  be  implicitly  performed  onto  the  residuals 
to  be  preconditioned. 

8.1  Supercoarsening 

The  supercoarsening  multigrid  (SCMG)  method  consist  of  a  x-line  relaxation 
(LSOR  sweeps)  and  the  yz-plane  correction  T.  The  correction  implies  so¬ 
lution  of  2D  aggregated  system  of  equations  in  yz- plane.  This  is  performed 
by  a  2D  multigrid  with  “two  by  two”  coarsening  and  powerful  relaxations  on 
every  level.  The  SCMG  method  features: 

•  Very  low  arithmetical  complexity, 

•  Almost  zero  sum  in  vertical  lines  for  the  residual.  It  is  good  if  the 
number  of  horizontal  mesh  layers  is  not  very  big, 
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•  Convergence  dependence  on  the  number  of  mesh  steps  in  horizontal 
direction 

•  Sensitivity  (not  very  strong)  to  permeabilities, 

•  Very  strong  reduction  of  unknowns  due  to  the  first  stage  aggregation. 


Figure  3:  Super-coarsening  multigrid  preconditioner. 


Full  3D  grid,  super¬ 
coarsening  multigrid 


Coarsened  3D  grid,  line  SOR 


Figure  4:  Additional  correction  for  the  super-coarsening  multigrid  precondi¬ 
tioner. 


Further  details  of  the  SCMG  method  can  be  seen  in  [13,  16,  17]. 


9  Numerical  Experiments 

The  first  case  involves  a  simplification  of  Richards’  equation,  which  is  used  to 
model  groundwater  transport  in  the  unsaturated  zone.  This  time-dependent 
model  in  two  space  dimensions  serves  as  a  window  to  observe  the  HONK 
algorithms  in  action  for  subsurface  simulation  applications.  We  believe  that 
this  (or  a  similar)  algorithm  should  benefit  reservoir  simulations  in  use  by 
the  petroleum  and  environmental  industries. 
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9.1  Richards’  equation 

This  example  problem  models  that  infiltration  of  the  near  surface  under¬ 
ground  zone  in  a  vertical  cross-section.  This  is  a  case  of  unsaturated  flow 
that  takes  place  in  the  region  between  the  ground  surface  and  the  water 
table,  i.e.,  the  so  called  vadose  zone  the  flow  is  driven  by  gravity  and  the 
transport  coefficients  are  modelled  by  empirical  nonlinear  functions  of  the 
moisture  content  below  saturation  conditions.  Thus,  the  model  equation  for 
this  two-dimensional  flow  is  given  by 


%  -V-[£>(c)-Vc] 


dK  (c) 
dz 


=  0, 


where  c  is  the  underground  moisture  content,  D  (c)  is  the  dispersivity  coef¬ 
ficient  and  K  (c)  is  the  hydraulic  conductivity.  This  equation  is  a  simplifica¬ 
tion  of  the  well  known  Richards’  equation  which  also  presents  nonlinearities 
in  the  transient  term. 

The  boundary  conditions  for  this  model  are  of  Dirichlet  type  at  the  sur¬ 
face,  where  a  constant  water  content  of  unity  is  kept  at  all  times,  and  zero 
Neumann  on  the  remaining  three  sides  of  the  model.  These  conditions  are 
given  by 


c  =  cs,  at  z  =  0, 

0  <  y  <  1, 

dc 

—  =  0,  at  z  =  1, 
oz 

0  <  y  <  1, 

dc 

dy 

=  0,  at  y  =  0  and 

V 

o 

T— 1 

II 

for  t  >  0.  Here,  z  represents  the  vertical  direction  and  y  represents  the 
chosen  horizontal  direction  for  the  cross-section.  The  surface  water  content 
is  represented  by  cs  =  1. 

ft  would  be  appropriate  to  say  that  the  same  hydraulic  conductivity 
plays  a  role  in  both  the  diffusive  and  the  convective  terms,  because  this 
model  is  just  the  continuity  equation  for  the  moisture  content,  where  Darcy’s 
law  (with  a  moisture  content  dependent  hydraulic  conductivity)  has  been 
replaced  for  the  superficial  velocity.  In  fact,  D(c )  is  often  referred  to  as  the 
capillary  diffusivity  and  is  given  by 


D(c)  =  ^r,  with  ip  = 

P 


where  p  and  p  are  the  groundwater  capillary  head  and  density,  respectively. 
However,  different  functional  forms  are  often  used  to  describe  the  dependence 
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Water  Content  Distribution  for  1  st  Time  Step 


Water  Content  Distribution  for  1000th  Time  Step 


Vertical  Distance 


Vertical  Distance 


Figure  5:  Water  content  distribution  at  1st  and  1000t/l  time  steps. 


of  both  coefficients  on  the  subsurface  water  content.  For  this  example,  our 
choices  of  dispersivity  and  hydraulic  conductivity  are,  respectively, 


D{c) 


K0cl 


and  K(c)  =  Kqc£  , 


with 


c-  c0 


where  Co  is  the  irreducible  underground  water  content  and  the  tensor  Kq 
is  a  position  dependent  coefficient  whose  nonzero  values  have  been  chosen 
according  to 

Ko (i,j)  =  p — 5  |  .  ..  >  for  1  <  i,j  <  5. 
b  -  j  |+i 

This  choice  of  Kq.  although  contrived,  is  sometimes  found  in  underground 
formations  and,  represents  a  narrow  channel  of  permeable  rock  where  the 
moisture  is  allowed  to  move.  The  hydraulic  conductivity  at  saturation  is 
proportional  to  the  rock  permeability,  which  has  been  shown  to  change  over 
a  few  orders  of  magnitude  within  relatively  short  distances  in  underground 
formations.  In  our  computational  experiments  we  take  Co  =  0.25,  which 
represents  a  typical  value  of  the  irreducible  water  content.  See  [2]  for  a 
comprehensive  discussion  of  this  model. 

Figure  5  shows  the  solution  for  the  moisture  content  distribution  over 
the  two-dimensional  domain  for  a  mesh  of  16  x  16  at  the  1st  and  1000th  time 
steps  of  simulation.  The  solution  shows  the  effect  of  the  heterogeneity  in  the 
resulting  subsurface  water  content. 
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A  constant  time  step  was  used  for  these  simulations,  which  was  chosen 
small  enough  to  allow  the  inexact  Newton  method  to  converge  within  40 
nonlinear  iterations  and  given  by 

At  =  — h2 . 

16 

This  small  time  step  was  required  in  order  to  use  the  solution  of  the  previous 
time  level  as  an  acceptable  initial  guess  for  the  nonlinear  iteration. 

Figure  6  shows  the  accumulated  (million)  floating  point  operations  for 
all  the  nonlinear  methods  as  the  simulation  progresses  up  to  100  time  steps, 
for  a  discretization  mesh  of  32  x  32.  No  preconditioning  was  used.  The  curve 
clearly  exhibits  the  computational  cost  trend  of  all  nonlinear  methods. 


Figure  6:  Performance  in  accumulated  millions  of  floating  point  operations 
of  Newton’s  method  and  the  HONK  method. 

The  HONK  algorithm  shows  a  significant  saving  in  computational  cost 
from  start  to  end  of  this  short  simulation  (as  depicted  in  Figure  6).  The 
increasing  difficulty  of  the  nonlinear  problems  as  simulation  advances  pro¬ 
duces  a  super  linear  growth  in  the  number  of  floating  point  operations.  This 
growth  is  not  only  due  to  the  complexity  of  the  nonlinear  problems  but  also 
to  that  of  the  linear  problem.  This  is  an  example  where  the  region  of  rapid 
convergence  is  far  from  the  initial  guess  given  at  every  time  step,  causing  un¬ 
expected  difficulties  to  Newton’s  method  before  reaching  that  region.  It  can 
be  observed,  for  this  particular  case,  that  secant  methods  produce  more  effi¬ 
cient  steps  toward  the  solution,  making  them  more  preferable  than  Newton 
type  of  approaches. 

Table  1  shows  a  comparative  convergence  behavior  of  a  suite  of  several 
methods.  The  table  confirms  the  excessive  work  (in  terms  of  nonlinear  itera- 


22 


Table  1:  Total  of  nonlinear  iterations  (NI)  and  GMRES  iterations  (GI)  for 
inexact  versions  of  several  nonlinear  solvers.  The  problem  size  considered  is 
of  size  16  x  16  gridblocks  after  100  time  steps  of  simulation. 


Method 

NI 

GI 

Newton 

1627 

11890 

Comp.  Newton 

835 

12186 

HONK 

391 

2673 

Broyden 

631 

4046 

NEN 

347 

4400 

KEN 

422 

2757 

tions)  carried  out  by  Newton’s  method  and  the  composite  Newton’s  method 
(also  known  as  Shamanskii  method,  a  higher-order  modification  of  Newton’s 
method,  see  [11])  compared  to  the  HONK  algorithm  and  all  secant  methods. 
The  composite  Newton’s  method  halves  the  number  of  nonlinear  iterations 
of  Newton’s  method  but  both  spend  about  the  same  total  number  of  linear 
iterations.  The  HONK  method  reduces  in  half  the  number  of  nonlinear  it¬ 
erations  taken  by  the  composite  Newton’s  method  and,  besides,  it  reduces 
by  an  almost  4-fold  the  total  number  of  linear  iterations  with  respect  to  this 
high-order  Newton  method.  Hence,  the  HONK  algorithm  not  only  tackles 
efficiently  the  nonlinearities  but  also  leads  to  much  easier  linear  problems 
that  arise  at  the  neighborhood  of  the  solution. 

The  NEN  algorithm  also  halves  the  number  of  nonlinear  iterations  shown 
by  Broyden’s  method  but  the  number  of  linear  iterations  accounts  for  the 
similar  computational  efficiency  of  both.  The  nonlinear  KEN  algorithm  takes 
an  intermediate  number  of  nonlinear  iterations  between  these  two  algorithms 
but  its  efficiency  is  marked  by  the  fewer  number  of  displayed  linear  iterations. 

9.2  Air-water  model 

We  consider  the  2-D  Vauclin  problem.  This  problem  was  implemented  by 
Jenkins  [10]  for  validating  the  air-water  model  in  IPARS  (Integrated  Parallel 
Accurate  Reservoir  Simulator)  [18]. 

The  dimensions  of  the  problem  are  20 ft  X  400 ft  X  20 ft,  with  corre¬ 
sponding  number  of  gridblocks  10  X  20  X  1  along  Z ,  Y  and  A;  that  is,  a  2-D 

case  in  the  plane  Z - Y.  Initial  pressures  and  saturations  are  computed  at 

equilibrium  from  a  pressure  value  of  503.94511  psi  and  a  saturation  of  .5. 

In  order  simulate  evaporation  and  infiltration  processes  accordingly,  air 
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Relative  Permeability  Capillary  Pressure 


Figure  7:  Saturation  curves  for  the  Vauclin  problem. 


and  water  production  wells  are  placed  in  one  extreme  of  the  domain  and 
one  water  injection  well  at  the  opposite  end.  Bottom  hole  pressures  are 
specified  to  maintain  a  constant  pressure  in  the  system  after  10  days  of 
simulation.  The  system  has  an  anisotropic  definition  of  permeability  with 
higher  values  at  the  side  of  the  ditch  (where  the  pumping/evaporation  takes 
place).  Porosity  values  are  held  constant  to  20%  everywhere.  Saturation 
curve  data  are  shown  in  Fig.  7.  For  representing  the  pressure  head  at  the 
ditch,  the  capillary  pressure  curve  shown  at  the  right  of  Fig.  7  is  decreased 
by  2  orders  of  magnitude. 


Water  pressure  contour  at  time  step  t=.5  days 
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503 
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Figure  8:  Pressure  (left)  and  saturation  (right)  contour  of  the  Vauclin  prob¬ 
lem  at  t  =  .5  days. 

Figure  8  shows  the  field  pressure  and  held  saturation  values  at  the  begin¬ 
ning  of  the  simulation.  Figure  9  shows  the  resulting  pressure  and  saturation 
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Figure  9:  Pressure  (left)  and  saturation  (right)  contour  of  the  Vauclin  prob¬ 
lem  at  t  =  1900  days. 


field  after  1900  days  of  simulation.  We  can  clearly  observed  that  the  recharge 
area  is  almost  linear  due  to  the  pumping  effect  caused  by  the  well  located  at 
the  right  side  of  the  reservoir.  As  expected  pressures  are  higher  at  the  left 
bottom  extreme  due  to  the  infiltration  zone  in  the  lower  permeability  zone. 

The  associated  linear  systems  (matrices  and  right  hand  side  vectors)  for 
the  Vauclin  test  problem  were  generated  by  the  two-phase  option  of  IPARS. 
The  data  for  the  tests  were  downloaded  from  the  simulation  after  1  time 
step  (0.5  days)  and  after  3064  time  steps  (1900  days).  Implementation  of 
the  different  preconditioning  options  were  performed  in  MATLAB  v6.1  on  a 
Linux  based  PC  machine. 


x  10s  xio10 


Figure  10:  Spectra  of  the  blocks  composing  the  sample  Jacobian  matrix  from 
the  air-water  case. 
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Figure  11:  Spectrum  of  the  sample  Jacobian  matrix  from  the  air-water  case 


Figure  10  shows  the  spectrum  of  each  of  the  blocks  involved  in  the  Ja¬ 
cobian  matrix  obtained  in  the  first  nonlinear  iteration  at  t  =  1900  days. 
Looking  at  Fig.  11  we  can  verify  the  resemblance  between  the  spectrum  of 
Jss  and  that  of  the  Jacobian  matrix  J.  Therefore,  the  conditioning  of  block 
Jss  has  an  immediate  incidence  on  the  conditioning  of  the  whole  system. 
In  this  respect,  we  can  expect  the  decoupling  operator  D~1  to  do  a  better 
preconditioning  job  than  the  Wp  operator  (see  discussion  at  the  end  of  sec¬ 
tion  6).  Moreover,  the  resulting  decoupled  matrix  is  weak  block  diagonally 
dominant  (i.e.,  the  spectral  radius  is  bounded  by  one)  and  therefore  classical 
block  iterative  methods  converge  for  this  matrix  for  any  right  hand  side  and 
initial  guess  [13].  We  also  note  that  despite  the  small  size  of  the  problem,  the 
condition  number  for  the  Jacobian  matrix  J  is  very  high.  Also,  J  is  highly 
indefinite. 

In  Figure  12  we  show  the  spectrum  of  the  resulting  Jacobian  matrix  after 
applying  the  decoupling  operators  D~1  and  W.  Interestingly  enough,  the 
Jacobian  spectrum  has  been  significantly  compressed  and  shifted  to  the  right 
half  of  the  complex  plane  by  the  action  of  ZW1 .  In  contrast,  strategies  that 
intent  to  preserve  as  much  as  possible  the  original  structure  of  the  matrix 
perform  very  poorly  as  preconditioners  (see  the  great  resemblance  between 
the  spectrum  of  IF  J  and  J.)  Several  experiments  like  this  one  have  indicated 
that  the  best  strategy  is  to  break  as  much  as  possible  the  coupling  between 
equations  (or  unknowns)  than  trying  to  preserve  some  desirable  properties 
of  the  individual  blocks. 

Figure  13  (top)  shows  the  spectrum  of  the  combinative  operator  applied 
to  the  Jacobian  matrix  for  an  exact  solution  of  the  pressure  system.  In  this 
particular  example,  M  was  taken  to  be  the  diagonal  (i.e.  jacobi  precondi- 
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Figure  12:  Spectrum  of  the  decoupled  sample  Jacobian  matrix  from  the 
air-water  case:  Applying  the  operator  Wp  (top)  and  D~1  (bottom). 


tioner)  part  of  J.  For  the  additive  and  multiplicative  two-stage  precondi¬ 
tioner,  we  specify  M  to  be  the  identity  matrix.  The  preconditioner  does  a 
good  job  in  clustering  the  eigenvalues,  although  we  can  predict  that  there 
are  some  small  ones  that  will  affect  the  convergence  of  an  Krylov  iterative 
solver.  In  contrast,  Fig.  13  also  shows  how  the  two-stage  additive  (middle) 
and  two-stage  multiplicative  (bottom)  perform  a  more  efficient  task  in  clus¬ 
tering  the  spectrum  around  the  point  (1,0)  in  the  complex  plane.  Note  that 
the  multiplicative  two-stage  preconditioner  produces  the  major  clustering  of 
the  real  parts  of  the  eigenvalues  around  unity  among  the  three  two-stage 
preconditioners. 

Looking  at  the  convergence  of  GMRES  with  different  preconditioners 
some  general  comments  are  in  order  (see  Table  2).  Linear  tolerance  was 
set  to  machine  precision:  1.10”1'.  The  traditional  preconditioners  do  not 
take  into  account  any  of  the  physics  of  the  multi-phase  model  and  either 
fail  to  converge  or  are  outperformed  by  some  of  the  more  thoughtful  two- 
stage  preconditioners.  In  particular  ILU(O)  and  block  Jacobi  do  not  make 
GMRES  to  converge  at  the  required  precision  within  400  iterations.  These 
two  preconditioners  present  difficulties  to  resolve  the  low  error  frequencies 
of  the  solution.  We  do  not  include  them  in  the  table. 
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Figure  13:  Spectrum  of  the  Jacobian  matrix  preconditioned  with  the  Mcom b 
operator  (top),  the  Ma( m  operator  (middle)  and  Mmuit  operator  (bottom). 
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As  was  expected  from  the  spectrum  information,  the  two-stage  multi¬ 
plicative  preconditioner  is  the  most  effective  one.  This  trend  was  observed 
for  different  simulation  times  and  time  step  sizes.  The  additive  and  mul¬ 
tiplicative  version  outperformed  the  combinative  one  in  all  cases  and  more 
noticeable  performance  differences  were  observed  as  the  time  step  was  in¬ 
creased. 

However,  all  of  the  two-stage  (except  for  the  2SComb)  preconditioners 
give  a  smaller  number  of  outer  iterations  for  the  longer  time  step.  The  key 
in  interpreting  these  results  is  in  the  action  of  the  full-decoupling  operator 
implemented  for  the  two-stage  preconditioners  and  its  own  power  to  precon¬ 
dition  the  system.  We  believe  that  the  weight  of  the  off-diagonal  Jacobian 
blocks  after  full  decoupling  is  less  for  the  longer  time  step  than  for  the  shorter 
one  and  the  preconditioner  is  more  effective  as  a  result. 


Table  2:  Total  of  linear  iterations  for  GMRES  with  different  two-stage  pre¬ 
conditioners. 


Preconditioner 

t  =  .5  days 

t  =  1900  days 

Combinative 

20 

119 

Additive 

14 

24 

Multiplicative 

13 

16 

10  Conclusions  and  Further  Work 

Two  novel  Newton-Krylov  algorithms  were  proposed  for  the  inexact  solution 
of  nonlinear  systems  of  algebraic  equations  with  potential  for  higher-order 
convergence  when  applied  to  a  variety  of  problems.  Both  of  these  methods 
are  based  on  implicit  Krylov-secant  updates  of  the  Jacobian  matrix  (i.e., 
Broyden  updates  of  the  Hessenberg  matrix  resulting  from  the  Arnoldi  de¬ 
composition  in  GMRES).  The  nonlinear  KEN  algorithm  is  an  inexact  form 
of  that  proposed  recently  by  Yang  with  the  added  efficiency  of  Broyden  up¬ 
dates  on  the  lower-dimensional  Hessenberg  matrix.  This  method  shows  a 
computational  performance  which  is  far  superior  to  that  of  either  Broyden’s 
or  the  NEN  algorithms.  The  HONK  method  uses  the  Krylov-Broyden  up¬ 
dates  to  generate  a  series  of  chord  steps  restricted  to  the  Krylov  subspace, 
which  results  in  an  extremely  efficient  secant  version  of  the  inexact  com¬ 
posite  Newton  implementation  (potentially  with  order  higher  than  2).  This 
algorithm  outperforms  both  the  Newton  and  standard  composite  Newton 
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method  in  terms  of  floating  point  operations  required  for  convergence  to  the 
same  tolerance. 

The  possibilities  for  extending  both  methods  are  enormous.  First,  more 
robust  versions  can  be  developed  by  enriching  the  underlying  Krylov  infor¬ 
mation.  This  can  be  realized  by  augmenting,  orthogonalizing  and  deflating 
(preconditioning)  the  Krylov  basis.  Advances  along  these  lines  have  been 
useful  to  develop  Krylov  subspace  method  for  solving  linear  systems  with 
different  right  hand  sides  and  a  changing  sequence  of  linear  systems  in  both 
matrix  and  the  right  hand  side  [20,  22]. 

A  second  source  of  improvement  has  to  do  with  the  projection  of  Krylov 
information  in  successive  time  steps.  The  idea  will  be  to  obtain  a  better 
initial  guess  for  the  nonlinear  method.  This  will  also  imply  an  increase  of 
the  region  of  convergence  and  a  further  accelerating  scope  for  both  the  KEN 
and  HONK  methods.  Attempts  to  generate  stable  time  stepping  from  Krylov 
basis  information  has  been  reported  in  [3]. 

Our  numerical  results  show  that  the  new  preconditioners  outperform  a 
few  traditional  approaches  proposed  in  the  literature  of  environmental  and 
reservoir  engineering:  ILU(O)  and  block  Jacobi  preconditioner  (for  precon¬ 
ditioning  of  the  entire  system),  and  an  inexact  version  of  the  combinative 
approach  (originally  developed  for  coupled  linear  systems). 

We  also  discussed,  in  a  algebraic  fashion,  the  use  of  constraint  operators 
for  performing  semi-  and  supercoarsening  multigrid  methods.  The  use  of 
these  operators  are  effective  whenever  the  contrast  of  coefficients  is  not  severe 
along  the  direction  where  the  coarsening  is  performed.  Nevertheless,  the 
success  of  these  methods  will  depend  upon  the  interplay  that  can  be  built  in 
deflating  eigenvalues  between  the  Krylov  iterative  method  and  the  projection 
of  the  problem  into  smaller  subproblems. 

We  consider  that  the  following  issues  on  two-stage  preconditioners  and 
constraint  operators  need  to  be  addressed  in  the  future: 

•  Dynamic  characterization  of  the  tolerances  controlling  the  inner  com¬ 
ponent  solvers  of  the  preconditioner. 

•  Theoretical  analysis  and  extension  of  the  preconditioners  to  several 
unknowns  per  gridblock.  It  is  important  to  know  what  choice  of  pri¬ 
mary  variables  and  what  properties  can  be  exploited  when  solving  large 
coupled  systems  of  nonlinear  equations.  We  have  observed  that  some 
decoupled  blocks  are  easier  to  solve  than  others,  so  this  may  determine 
the  type  of  linear  solvers  to  be  used  within  the  preconditioner. 
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•  Study  the  deflation  properties  of  these  methods  and  their  complemen¬ 
tarity  with  deflating  Krylov  iterative  methods  (see,  e.g.,  [4,  19]). 

The  results  obtained  are  a  valuable  insight  for  solving  3-D  air-water  prob¬ 
lems  with  discontinuous  Galerkin  (DG)  methods.  Currently,  our  research 
group  is  studying  two  level  domain  decomposition  approaches  for  solving 
pressures  and  local  DG  for  saturations.  The  decoupling  framework  devel¬ 
oped  here  serves  to  the  separate  treatment  of  both  physical  entities. 

In  the  case  of  coupling  flow  and  geomechanics,  the  linear  system  dramat¬ 
ically  increases  in  complexity.  Besides  pressures  and  saturations,  the  linear 
system  includes  equations  for  the  linearized  poro-elasticity  model  and  dis¬ 
placements  need  also  to  be  solved.  However,  simple  decoupling  arguments 
as  the  ones  described  here  with  a  judicious  application  of  quadrature  rules 
can  reduce  the  27-point  stencil  to  a  7-point  stencil.  An  algebraic  multigrid 
algorithm  is  used  to  solve  each  displacement  component.  To  reduce  the 
CPU  time  consumed  in  the  setup  phase  of  AMG,  we  have  proposed  to  use 
AMG  for  solving  the  residual  corrections  in  an  aggregated  subspace.  Accord¬ 
ing  to  preliminary  numerical  experiments,  the  resulting  preconditioners  are 
very  robust  with  respect  to  both  increasing  Poisson  ratio  and  large  jumps 
in  coefficients.  It’s  convergence  is  fairly  stable  with  respect  to  mesh  param¬ 
eter  h.  High-order  Newton-Krylov  method  should  also  aid  at  reducing  the 
turnaround  time. 

In  general,  further  computational  experiences  are  required.  Our  results 
are  promising  but  need  to  be  evaluated  under  more  stringent  physical  situ¬ 
ations  and  at  a  larger  scale.  Among  target  applications  we  consider  useful 
the  insights  coming  from  three-phase,  compositional  and  their  coupling  to 
geomechanics  and  chemical  processes.  The  different  strengths  in  nonlinear i- 
ties  between  the  typical  coupled  equations  of  the  subsurface  model  (i.e.,  one 
for  the  pressure  potential  or  flow  driving  force  and  another  or  others  for  the 
saturations/  concentrations)  and  the  coupling  itself  between  reservoir  vari¬ 
ables  are  sufficiently  good  reasons  to  investigate  more  on  the  preconditioning 
ideas  displayed  in  this  project. 
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